SUMMARY


Task Objectives

Phase 2 of the Adaptive Segmentation Evaluation contract addressed the
issue of improving segmenter performance on military vehicles in IR
imagery through the use of temporal processing techniques. The specific
objectives were as follows:

1 Develop temporal-based techniques to augment current segmentation
algorithms.

2 Develop a set of metrics to quantitatively represent segmenter
performance in terms of quality and consistency of segmentation.

3 Perform a comparative study of performance results between the
modified segmentation approach and the unaided approach.

Technical Problems

A study of the usefulness of dynamic scene information is necessary to
fully  evaluate  the  options  associated  with  temporal-based  segmentation 
techniques.  The  purpose  of  this  study  is  to  identify  those  attributes  that 
are  most  readily  applicable  to  segmentation.  Subsequent  modifications  to 
the  segmentation  algorithm  will  depend  on  the  type  of  information  available 
and  the  optimum  point  of  application. 

General  Methodology 


Two basic approaches for using temporal properties were assessed. Each of
these approaches is based on a different definition of the segmentation
problem. One definition states that inconsistent segmentation results are
due primarily to the inherent sensitivity of the algorithm methodology. For
this definition, the solution would be to enhance the algorithm. The second
definition states that the difficulty in developing an algorithm that
generates consistent results is due to the high degree of data variation
between frames. For this definition, the solution would be to stabilize the
data. An analysis of a single metric, ERIM's TIR², computed for two military
vehicles (tanks A and B) over a sequence of 20 consecutive frames indicates
that data variation (tank B varies over a range of 71.21), not algorithm
sensitivity, is the problem (Figure 1).

Figure 1. Variations of TIR² Over the 20-Frame Set for Tank A and Tank B


For this reason, the methodology concentrated on developing techniques
for image data stabilization rather than segmenter enhancement. The
justification for adopting this methodology is that a more consistent input
signal would obviate the need for special case processing by the segmenter.
Image data stabilization was accomplished through the use of multiframe
data integration techniques. These techniques attempted to smooth the
frame to frame transition of image data by limiting the noise effects and
other image ambiguities.

Experimental  Methods 

A set of experiments was defined to assess the effects of multiframe
data smoothing on vehicle signatures and segmenter performance. The
experiments consisted of applying a multiframe data smoothing operator and
an independent frame enhancement operator to three sets of consecutive
sequences of ERIM truthed TI images. The rule directed segmenter was then
applied to the raw data, smoothed data, and independently filtered data.
Finally, a comparative analysis of segmenter performance was conducted by
evaluating the segmenter stability metrics on each of the three data types,
and data variations in vehicle signatures were analyzed from the results of
the data variation metrics.

The filters used for the multiframe data smoothing experiments were
the 1x1x5 median, 1x1x7 median, and 1x1x9 median. Smoothed data from the
multiframe mean filter were not sufficiently different from the median
results to warrant extensive testing. The 3x3x1 median was used as the
independent frame enhancement operator. This operator allowed comparison
with a more conventional approach to noise reduction.

Discussion 

In  general,  the  experiments  conducted  on  the  test  data  sets  confirmed 
the primary strengths of multiframe smoothing. Both the conventional and
multiframe filtering improved segmentation results relative to those
obtained with the raw data. The primary difference was in the behavior of the
features  computed  on  the  three  data  types.  The  features  computed  on  the 
raw  data  and  conventionally  filtered  data  showed  random  fluctuations  and 
wide  distributions,  which  is  typical  for  FLIR  data.  The  features  computed 
on  the  multiframe  smoothed  data  were  better  clustered  and  showed  increased 
signal qualities. The improved feature organization and higher response
reflect the increase in data stability and noise reduction. These results
have  two  important  consequences.  First,  the  improved  signal  quality 
greatly  reduces  the  need  for  special  purpose  processing  by  each  automatic 
target  recognition  (ATR)  component  to  compensate  for  image  ambiguities  in 
the  raw  data.  Second,  features  that  represent  higher  levels  of  structural 
detail,  which  are  usually  masked  by  noise,  can  be  computed  for  improved 
object  discrimination  and  classification  performance. 

Important  Findings  and  Conclusions 

The  results  clearly  indicate  the  advantages  of  multiframe  data 
smoothing.  These  results  also  emphasize  the  difficulties  that  exist  when 
image  characteristics  are  not  well  represented.  When  a  sensor  is  in 
motion,  scene  information  must  be  registered  prior  to  processing.  Bland 
image  conditions  do  not  provide  sufficient  feature  information  to  track 
with  the  degree  of  accuracy  required  for  multiframe  integration.  This 
situation  represents  a  constraint  of  the  multiframe  approach. 

The problem of bland image conditions is not solvable through the use
of image enhancement techniques. Such techniques do not improve the
fundamental elements represented in the data. These techniques mainly improve
the aesthetics of the image. The bland-image problem must be addressed at
the  system  level.  A  viable  solution  is  to  switch  between  multiframe 
processing  and  independent  frame  processing,  based  on  the  success  of  frame 
to  frame  feature  tracking. 

Implications For Further Research


The most important advantage of multiframe data smoothing is improved
signal quality. This improvement increases segmenter performance and, more
importantly, feature stability. The increased response and improved
clustering of the metrics for the smoothed data images indicate the importance
of this technique for object classification. A comparative study of
feature selection, feature clustering, feature separability, and object
classification between an ATR trained on raw-data images and one trained on
smoothed-data images would provide a total assessment of multiframe data
smoothing.

1.0 OBJECTIVES AND APPROACH


Phase  2  of  the  Adaptive  Segmentation  effort  was  concerned  with 
improving  segmenter  performance  on  military  vehicles  in  IR  imagery  through 
the  use  of  temporal  processing  techniques.  The  approach  concentrated  on 
developing  methods  for  image  data  stabilization  rather  than  segmenter 
enhancements. The rationale is that a more consistent input signal would
eliminate the need for special case processing by the segmentation
operator.  Image  data  stabilization  was  accomplished  through  the  use  of 
multiframe  data  integration  techniques.  These  techniques  attempt  to  smooth 
the  frame  to  frame  transition  of  image  data  by  limiting  the  effects  of 
noise  and  other  image  ambiguities. 

To  evaluate  multiframe  processing,  a  data  base  consisting  of  three 
sets  of  consecutive  sequences  of  ERIM  truthed  TI  imagery  was  created.  A 
multiframe  smoothing  operator  and  an  independent  frame  enhancement  operator 
were  applied  to  each  of  the  data  sets.  A  comparative  analysis  was  performed 
on  segmentation  results  for  each  set  of  consecutive  sequences  of  unprocessed 
(raw)  imagery,  independently  filtered  (enhanced)  imagery,  and  multiframe 
smoothed imagery. The frame-to-frame consistency was analyzed for both the
structural  properties  of  the  vehicles  and  the  segmentation  results  for  each 
data  set. 

To assess the structural stability of a vehicle, a set of local
intensity-based metrics was computed for each data type in a test set. The
truth silhouettes provided by ERIM were used in computing the metric
values. Structural consistencies were assessed by examining the variation
in the distributions of each of the metrics. The process used the degree
of frame to frame correlation of each metric to determine the structural
stability in the data properties represented by the metrics. Variations in
signal quality were determined by comparing the metrics for the two
filtered image sets to the metrics for the raw imagery. The polarity of their
differences indicated an increased or decreased signal response.


To  assess  segmenter  performance,  a  comparative  study  was  conducted 
using  the  rule  directed  segmenter  (RDS)  as  the  control  algorithm  and  the 
segmentation  accuracy  metric  of  binary  area  cross  correlation  as  the 
performance  measure.  For  each  test  set,  the  RDS  was  applied  to  each  of 
the  three  data  types:  the  raw  data,  multiframe  smoothed  data,  and 
independent  frame  enhanced  data.  Performance  stability  was  determined 
by  examining  the  degree  of  frame  to  frame  consistency  in  the  segmentation 
accuracy metric. Performance quality was measured by computing the average
of the metric. Improvement in segmentation performance was determined
by  comparing  the  response  of  the  metric  for  the  two  filtered  image  sets  to 
that  computed  on  the  raw  imagery. 


2.0  TEMPORAL  INFORMATION  ANALYSIS 


2.1  Definition  of  Temporal  Properties 

The  utility  of  dynamic  scene  information  is  universal,  extending  to 
all  elements  in  the  target-recognizer  system  architecture  (enhancement, 
detection,  segmentation,  feature  extraction,  and  classification),  as  well 
as  post-processor  functions  such  as  target  prioritization,  tracking,  and 
aimpoint  selection.  The  multiframe  approach  provides  the  opportunity  to 
improve  component-level  performance  and,  subsequently,  ATR  performance. 

The  overall  utility  of  multiframe  processing  and  the  key  attributes  of 
dynamic  scene  information  are  summarized  in  Table  2.1-1.  Table  2.1-1  shows 
that  two  scene  attributes,  platform  motion  and  temporal  statistics,  are 
most  readily  applicable  to  segmentation. 


TABLE 2.1-1

Summary of Multiframe Processing and Dynamic Scene Information

Dynamic Scene         Application             Utility
Attribute
--------------------  ----------------------  ---------------------------------
Target motion         Moving target           Motion as detection,
                      indication (MTI)        segmentation cue
                                              Target velocity for
                                              prioritization, prediction,
                                              aspect, aimpoint
                                              Motion as context

Platform motion       Motion stereo           Passive ranging
                      Scene normalization     Terrain and object 3-D relief
                                              Navigation

Temporal statistics   Sequential compound     Classification accuracy
                      decisions               Consistent segmentation
                                              Adaptive preprocessor thresholds

Scene history         Scene prediction        A priori knowledge for next frame
                                              Environment evaluation
                                              Feedback and global control

All of the above      Intelligent tracking    Multitarget track
                                              Track through obscuration
                                              Reacquire after breaklock
The effects of platform motion on the imagery can be accurately
determined by applying multiframe processing to the sequence of images.
Motion is defined in terms of direction and magnitude of displacement.
These parameters can be effectively used for frame to frame registration
and scene normalization.

Temporal  statistics  of  the  dynamic  scene  improve  performance  by  basing 
statistical  decisions  on  ensemble  data  rather  than  single-event  data.  This 
capability  provides  adaptive  optimization  of  image  enhancement  parameters 
and  segmentation  consistency. 

2.2  Advantages  of  Temporal  Properties 

To recognize the advantages of using temporal context in image
processing, the problems associated with single frame processing must be
understood.  The  two  major  problems  in  image  processing  are  data 
instability  and  image  degradation. 

For  any  given  frame  of  information,  an  operator,  such  as  a  segmenter, 
determines  the  optimal  result,  based  on  the  conditions  represented  in  the 
data.  If  data  conditions  vary  significantly  from  frame  to  frame,  the 
operator's  results  will  be  inconsistent;  and  these  inconsistencies 
propagate  through  each  component  of  the  ATR  system,  impacting  overall 
performance. 

A  second  problem  associated  with  single  frame  processing  is  image 
degradation.  Atmospheric  effects  such  as  attenuation,  diffusion,  and 
diffraction can affect image quality. Sensor effects, such as lens
distortion, focal length, and vibration, and effects associated with the
pixelization process can also affect image quality.

2.3  Application  of  Temporal  Properties 

For this application, multiframe data smoothing is defined as the
process of operating on an "M"-deep stack of registered images to reduce the
independent random fluctuations in the data, while improving signal quality
and stability. When this process is applied to an image g(x,y) that is
formed by the addition of uncorrelated noise n(x,y), the noise component of
that image decreases as the number of integrated images increases. The
reduction in the random component is specified by the equation:

$$\sigma_{g(x,y)} = \frac{1}{\sqrt{M}} \, \sigma_{n(x,y)}$$

where $\sigma$ = standard deviation.


This  equation  shows  that  the  reduction  in  noise  is  inversely  proportional 
to  the  square  root  of  the  number  of  images  (M).  As  the  number  of  noisy 
images  becomes  large,  the  data  quality  approaches  that  of  an  uncorrupted 
signal.  However,  the  benefits  of  using  large  numbers  of  images  for  noise 
reduction  are  limited  by  the  natural  constraints  inherent  in  a  moving 
sensor.  Closure,  magnification  differences,  changes  in  perspective,  and 
information  masking  limit  the  number  of  images  that  can  be  effectively 
integrated.  The  number  of  images  for  multiframe  smoothing  is  therefore 
determined  by  the  range  to  the  vehicles  and  the  behavior  of  the  sensor 
(aircraft,  tank,  etc.). 
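
The inverse square-root relationship can be checked numerically. The
following is a minimal sketch, not the software used in this effort. It
assumes a constant scene value, zero-mean Gaussian noise, and perfectly
registered frames; the function names and default parameters are
illustrative only.

```python
import random
import statistics


def stack_mean(frames):
    """Pixel-wise mean over an M-deep stack of registered frames (1x1xM mean)."""
    return [statistics.fmean(pixel_stack) for pixel_stack in zip(*frames)]


def noise_sigma_after_averaging(m, n_pixels=4000, sigma_n=10.0, seed=0):
    """Measure the residual noise after integrating m simulated noisy frames."""
    rng = random.Random(seed)
    signal = 100.0  # assumed constant, noise-free scene value
    frames = [
        [signal + rng.gauss(0.0, sigma_n) for _ in range(n_pixels)]
        for _ in range(m)
    ]
    # The scene is constant, so any spread left in the result is noise.
    return statistics.pstdev(stack_mean(frames))


for m in (1, 4, 16):
    measured = noise_sigma_after_averaging(m)
    predicted = 10.0 / m ** 0.5
    print(f"M={m:2d}  measured sigma={measured:5.2f}  predicted={predicted:5.2f}")
```

Under these assumptions the measured value should track the predicted
value closely; with a moving sensor, registration error would add a term
that this sketch does not model.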




We have tested three data smoothing techniques: 1x1xn median, 1x1xn
mean, and 1x1xn conditional mode-median. The "n" factor in the 1x1xn
notation relates to the depth of the filter or number of stacked images.


One advantage of the 1x1xn median filter is that the median is not
sensitive to single-sample spikes, noise, or other extremes that may exist in
a sample set. Another advantage is that the number chosen to represent the
sample set is a number which exists in the sample set. A third advantage
is that the median is also a minimum distance number: the sum

$$\sum_{i=1}^{n} |x_i - A|$$

is minimized when A = median.

The 1x1xn mean filter has properties similar to the median when the
samples are fairly related. Unlike the median, the mean filter considers
all numbers in the sample set when computing a result. For this reason,
the mean is sensitive to extremes for small sample sizes, and the number
chosen to represent the sample set may not be an original member of the
sample set. Like the median, the mean is also a minimum distance number:
the sum

$$\sum_{i=1}^{n} (x_i - A)^2$$

is minimized when A = mean.
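
The two minimum distance properties can be verified by brute force on a
small sample set. The sketch below is illustrative only: the five
intensity values are hypothetical (the last one plays the role of a
spike), and the search is restricted to integer candidates for
simplicity.

```python
import statistics

samples = [100, 100, 104, 106, 180]  # hypothetical intensities; 180 is a spike


def sum_abs_dev(values, a):
    """Sum of absolute deviations from candidate value a."""
    return sum(abs(x - a) for x in values)


def sum_sq_dev(values, a):
    """Sum of squared deviations from candidate value a."""
    return sum((x - a) ** 2 for x in values)


candidates = range(90, 191)
best_abs = min(candidates, key=lambda a: sum_abs_dev(samples, a))
best_sq = min(candidates, key=lambda a: sum_sq_dev(samples, a))

print(best_abs, statistics.median(samples))  # 104 104
print(best_sq, statistics.fmean(samples))    # 118 118.0
```

The spike pulls the squared-deviation minimizer (the mean) up to 118, a
value that is not a member of the sample set, while the
absolute-deviation minimizer (the median) stays at 104.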


The conditional 1x1xn mode-median filter optimizes the sample
selection process. The median filter is used when a sample set contains
unrelated numbers. When a single value occurs more than once in a sample
set, the median result is replaced by the most repeated value, or the mode.
This process is similar to assigning a probability to each sample value.
The selected value is either the highest probability number (mode) or the
median (when equal probability exists).
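
The three temporal filters can be sketched as operators on the "n"
samples that one pixel location takes through the registered stack. This
is a minimal illustration, not the tested implementation. The function
names are hypothetical, and the handling of a tie between several equally
repeated values (fall back to the median) is an assumption based on the
"equal probability" wording.

```python
from collections import Counter
import statistics


def median_1x1xn(samples):
    """1x1xn median: middle value of one pixel's temporal samples."""
    return statistics.median(samples)


def mean_1x1xn(samples):
    """1x1xn mean: average of one pixel's temporal samples."""
    return statistics.fmean(samples)


def mode_median_1x1xn(samples):
    """Conditional mode-median: use the mode if one value repeats most often."""
    counts = Counter(samples)
    top = max(counts.values())
    modes = [value for value, count in counts.items() if count == top]
    if top > 1 and len(modes) == 1:
        return modes[0]
    # No repeated value, or a tie between repeated values (assumption):
    # all candidates are equally probable, so use the median.
    return statistics.median(samples)


def smooth_stack(frames, operator):
    """Apply a 1x1xn operator at every pixel of a stack of flattened frames."""
    return [operator(pixel_stack) for pixel_stack in zip(*frames)]


# Five hypothetical registered frames of three pixels each.
frames = [
    [100, 50, 10],
    [100, 52, 11],
    [104, 51, 12],
    [106, 90, 13],
    [180, 50, 14],
]
print(smooth_stack(frames, median_1x1xn))       # [104, 51, 12]
print(smooth_stack(frames, mode_median_1x1xn))  # [100, 50, 12]
```

The first two pixels contain a single-frame spike (180 and 90). The
median and mode-median results ignore it, whereas `mean_1x1xn` would
return 118.0 for the first pixel.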




3.0  DEFINITION  OF  METRICS 

A set of metrics that assesses the behavior of information as a
function of time was defined. The metrics fall into three general
categories:  sensor  variation  metrics,  segmenter  stability  metrics,  and 
optical  flow  stability  metrics. 

3.1  Sensor  Variation  Metrics 

The  sensor  variation  metrics  statistically  represent  fluctuations  in 
the  raw  data  prior  to  any  processing.  These  metrics  indicate  the  degree  of 
instability  in  the  image  acquisition  process  (from  sensor  to  digital 
format).  Since  the  metrics  can  only  be  computed  on  the  digitized  images, 
their  results  represent  the  accumulated  effects  of  each  processing 
component  in  the  image  acquisition  system.  System  fluctuations  are 
measured  from  the  variations  in  thermal  properties  of  the  vehicles  and 
their  local  background  over  a  sequence  of  "n"  consecutive  images.  The 
thermal  properties  of  the  object  and  the  background  for  any  given  image  are 
represented by the intensity-based metrics: entropy, contrast, TIR², and
TBIR². By computing the variations of these metrics over "n" images,
the metrics for entropy variation, contrast variation, TIR² variation,
and TBIR² variation are derived. The variation (V) metrics are given by
the equation:

$$V = \sqrt{\frac{\sum_{i=1}^{n} (C_i - \bar{C})^2}{n - 1}}$$

where $C_i$ are the metric values, $\bar{C}$ is the average metric value, and
n is the number of images.

"Characterization of ATR Performance in Relation to Image Measurements"
(ATRWG working document 12-12-84) defines these four metrics to
represent the fluctuations in object-to-background separability in terms of
1) average intensity (contrast), 2) background intensity variation
(TIR²), and 3) object and background intensity variation (TBIR² and
entropy).

The sensor variation metrics show the effectiveness of data smoothing
in stabilizing the signal (noise removal). The degree of data stability is
determined by computing the percent change in variation before and after
smoothing ($\Delta V$), which is given by

$$\Delta V = |C_s - C_r| \times 100\%$$

where $C_s$ is the smoothed data and $C_r$ is the raw data.


3.2  Segmenter  Stability  Metric 

The  segmenter  stability  metric  represents  the  ability  of  the  segmenter 
to  consistently  perform  over  a  sequence  of  "n"  consecutive  images. 
Consistent  segmenter  performance  is  defined  as  a  result  which  is  similar  to 
the  previously  generated  segmentation  result,  independent  of  the  quality  of 
the segmentation. This means that a segmenter that consistently segments
50 percent of the object would have a higher stability measure than one
that oscillates between 70 and 90 percent. It also means that two
segmented objects, one consistently at 40 percent and the other
consistently at 90 percent, have the same stability measure. The average
quality  of  segmentation  for  each  object  is  also  computed.  This  combination 
of  measures  indicates  the  quality  and  stability  of  segmentation  for  each 
vehicle. 

The segmenter stability metric is determined by computing the
variation in the segmentation accuracy measure of binary area cross
correlation (BACC) for each object over a sequence of "n" consecutive
images. Segmenter stability (SS) is given by

$$SS = \sqrt{\frac{\sum_{i=1}^{n} (BACC_i - \overline{BACC})^2}{n - 1}}$$


The  segmenter  stability  metric  is  computed  over  the  objects  prior  to  and 
after  data  smoothing.  The  segmenter  stability  metric  provides  a  good 
indication  of  the  effectiveness  of  data  smoothing  in  stabilizing  the  output 
segmentation. Success in achieving segmenter stability is determined by
examining the values of the variation metric before and after smoothing,
while the degree of success is measured by computing their percent of
change. The percent change in segmenter stability ($\Delta SS$) is given by

$$\Delta SS = |SS_s - SS_r| \times 100\%$$

where $SS_s$ is segmenter stability for smoothed data and $SS_r$ is segmenter
stability for raw data.
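
The distinction between stability and quality can be illustrated with a
few hypothetical BACC sequences. This is a sketch under stated
assumptions: the variation is taken to be the sample standard deviation
(the n - 1 normalization), the BACC values are invented for illustration,
and the function names are not from the original software. Note that a
smaller variation corresponds to a more stable segmenter.

```python
import statistics


def segmenter_stability(bacc_values):
    """Variation of BACC over n frames (assumed: sample standard deviation)."""
    return statistics.stdev(bacc_values)


def average_quality(bacc_values):
    """Average segmentation accuracy over the same n frames."""
    return statistics.fmean(bacc_values)


# Hypothetical BACC sequences over six consecutive frames.
steady_40 = [0.40] * 6
steady_50 = [0.50] * 6
steady_90 = [0.90] * 6
oscillating = [0.70, 0.90, 0.70, 0.90, 0.70, 0.90]

for name, seq in [("steady 40%", steady_40), ("steady 50%", steady_50),
                  ("steady 90%", steady_90), ("70-90% swing", oscillating)]:
    print(f"{name:13s} variation={segmenter_stability(seq):.3f} "
          f"quality={average_quality(seq):.2f}")
```

The steady sequences all have zero variation regardless of their quality,
while the oscillating sequence has the highest average quality of the
first, second, and fourth cases but a nonzero variation. This is why the
two measures are reported together.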

3.3  Optical  Flow  Stability  Metrics 


Optical  flow  is  derived  by  recording  the  frame  to  frame  displacement 
of  a  set  of  distinct  features  distributed  throughout  the  image.  A  feature 
is  selected  according  to  its  uniqueness,  relative  to  other  information  in 
its  local  neighborhood.  The  type  of  information  that  features  represent, 
such  as  a  tree  trunk,  road  segment,  or  manmade  object,  is  completely 
scenario  dependent.  Scenarios  that  afford  a  high  degree  and  variety  of 
scene  context  are  ideal  for  feature  selection  and  matching.  However,  as 
scene  conditions  become  indistinguishable,  such  as  in  a  desert,  the  process 
of  locating  and  tracking  distinct  features  becomes  very  difficult. 


The  uniqueness  factor  of  a  selected  feature  is  an  important  indicator 
of  the  likelihood  of  the  system  to  correctly  track  that  feature  over  time 
(sequence  of  "n"  images).  A  measure  of  uniqueness  is  computed  by  examining 
the  maximum  response  of  a  feature  relative  to  the  average  response  over  its 
local  neighborhood.  The  feature  uniqueness  (F)  is  given  by 

$$F = \frac{\text{maximum feature} - \text{mean feature}}{\text{feature variance}}$$

For  example,  if  the  feature  of  interest  was  contrast,  then  the  local  region 
representing  the  highest  level  of  contrast  would  be  selected.  Feature 
uniqueness  would  be  determined  by  computing  the  difference  of  highest 
contrast  and  average  contrast  normalized  by  the  contrast  variance.  The 
feature-uniqueness  metric  is  a  measure  of  reliability  when  performing  frame 
to frame registration for data smoothing. Low-uniqueness features have a
higher probability of correlation error and, consequently, of registration
error.


Successfully tracking the positional changes of selected features
between consecutive images is accomplished by applying full-intensity area
correlation. The degree of success in feature matching is indicated by the
correlation coefficient [$\rho(i,j)$], which is given by

$$\rho(i,j) = \frac{\sum_{x,y}^{N,M} [R(x,y) - \bar{R}]\,[L(x-j,\,y-i) - \bar{L}]}{\sqrt{\sum_{x,y}^{N,M} [R(x,y) - \bar{R}]^2 \; \sum_{x,y}^{N,M} [L(x-j,\,y-i) - \bar{L}]^2}}$$

The average correlation coefficient, which is computed over the "n"
consecutive images, is a good indicator of the reliability of the

4.0 SYSTEM APPROACH


The  most  advantageous  attribute  of  multiframe  processing,  as  compared 
to  independent  frame  processing,  is  the  ability  to  integrate  information 
acquired  over  a  continuing  sequence  of  imagery.  When  a  sensor  is 
stationary,  the  integration  problem  involves  applying  a  data  smoothing 
operator  to  stabilize  the  signal.  However,  when  a  sensor  is  in  motion, 
the  problem  becomes  more  difficult.  As  a  sensor  moves,  the  information  in 
the  image  (field  of  view)  also  moves.  To  accomplish  multiframe  integration 
in  this  context,  the  image  smoothing  operator  must  be  preceded  by  frame  to 
frame  registration  of  the  data.  Since  scenarios  for  this  contract  specify 
moving  sensors,  our  system  approach  includes  the  extraction  of  scene  motion 
followed  by  frame  to  frame  registration. 

To  simplify  the  data  registration  process  and  limit  registration 
errors,  the  data  registration  operator  is  only  applied  to  subimage  windows 
that  pertain  to  the  vehicle  of  interest.  The  window  locations  are 
determined  by  the  detection  reports  generated  by  the  prescreener,  while 
window  sizes  are  based  on  estimated  object  size  using  interpixel  distance 
information  (IPD).  A  point  matching  operator  is  used  to  associate  each 
detection  report  with  a  corresponding  flow  vector  containing  the  x,y 
displacements  for  that  subregion  of  the  image  over  the  full  sequence  of 
images.  Once  the  data  registration  operator  has  been  run,  the  final 
process  is  the  application  of  a  data  smoothing  operator.  Our  overall 
system  approach  is  shown  in  Figure  4.0-1. 

4.1  Extraction  of  Scene  Motion 

In  regard  to  motion  extraction,  scene  motion  is  defined  as  the  frame 
to  frame  changes  in  position  of  scene  information.  The  system  approach  for 
extracting  positional  changes  of  scene  information  consists  of  using  an 
interest  operator,  partition/local  maximum  operator,  and  an  interest  point 
correlation  operator  (Figure  4.1-1). 


Smoothing  System  Overview 


Our current interest operator is called the size contrast operator (SCO).
The SCO (Figure 4.1-3) is designed to measure the level of contrast between
an inner size-grated rectangle and an outer surrounding size-grated collar.
The difference between the average of the inner window and the outer border
region is computed and stored as an image metric for each pixel (Figure
4.1-2, upper right corner). The SCO can function as an edge operator or as
a detector for localized regions of high contrast and specified size. The
advantage of the SCO is that it accurately locates the centroid of
localized features; the major disadvantage is that it is not effective when
the scene lacks localized regions of high contrast, such as in a desert
environment.


Figure 4.1-3. Size Contrast Window (figure label: average intensity B(x,y))
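
The inner-window and collar computation described for the SCO can be
sketched as follows. This is a minimal illustration rather than the
operator as implemented: it assumes square windows, and the function name
and the `inner_half` and `collar` parameters are hypothetical. The caller
is responsible for keeping the outer window inside the image.

```python
def size_contrast(img, row, col, inner_half, collar):
    """Mean of the inner window minus mean of the surrounding collar.

    img is a list of rows; the inner window spans +/- inner_half pixels
    around (row, col) and the collar is `collar` pixels wide (assumed square).
    """
    outer_half = inner_half + collar
    inner, border = [], []
    for r in range(row - outer_half, row + outer_half + 1):
        for c in range(col - outer_half, col + outer_half + 1):
            if abs(r - row) <= inner_half and abs(c - col) <= inner_half:
                inner.append(img[r][c])
            else:
                border.append(img[r][c])
    return sum(inner) / len(inner) - sum(border) / len(border)


# Hypothetical 7x7 scene: background of 10 with a 3x3 hot region of 50.
img = [[10] * 7 for _ in range(7)]
for r in range(2, 5):
    for c in range(2, 5):
        img[r][c] = 50

print(size_contrast(img, 3, 3, inner_half=1, collar=1))  # 40.0
print(size_contrast(img, 3, 2, inner_half=1, collar=1))  # smaller, off-center
```

The response peaks when the inner window is centered on the hot region,
which is consistent with the statement that the SCO locates the centroid
of localized features.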


To reduce the contrast metric image to individual points representing
locations of local maximum contrast, a partition and local maximum operator
is applied (Figure 4.1-2, lower left corner). The procedure consists of
partitioning the metric image into "N" windows (in this representation, N =
16) and locating the highest local maximum within each window. The
windows are recessed from the edges to avoid nominating points at the edge
of the image. This method of point nomination has two basic advantages.
First, by nominating the most distinctive location in each partition, the
local maximum operator maximizes the chances of successful point
correlation. Second, the resulting point distribution is, by definition,
nearly uniform (Figure 4.1-2, lower right corner: selected points overlaid
on the intensity image).
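
The partition and local maximum step can be sketched as below. This is an
illustration under assumptions: the partitions form a square grid, the
window maximum is used in place of a separate local-maximum test, and the
`parts_per_side` and `margin` parameters are hypothetical names (a value
of 4 partitions per side gives 16 windows).

```python
def nominate_points(metric, parts_per_side=4, margin=1):
    """Return one (row, col, value) per partition: its maximum metric value.

    The partitions tile the image interior, recessed `margin` pixels from
    each edge so that no point is nominated on the image border.
    """
    rows, cols = len(metric), len(metric[0])
    r_edges = [margin + k * (rows - 2 * margin) // parts_per_side
               for k in range(parts_per_side + 1)]
    c_edges = [margin + k * (cols - 2 * margin) // parts_per_side
               for k in range(parts_per_side + 1)]
    points = []
    for i in range(parts_per_side):
        for j in range(parts_per_side):
            value, r, c = max(
                (metric[r][c], r, c)
                for r in range(r_edges[i], r_edges[i + 1])
                for c in range(c_edges[j], c_edges[j + 1])
            )
            points.append((r, c, value))
    return points


# Hypothetical 10x10 contrast metric image with one peak per quadrant and a
# strong response on the border that should be ignored.
metric = [[0] * 10 for _ in range(10)]
metric[0][0] = 100
for r, c, v in [(2, 3, 5), (2, 7, 6), (6, 2, 7), (7, 8, 9)]:
    metric[r][c] = v

print(nominate_points(metric, parts_per_side=2, margin=1))
# [(2, 3, 5), (2, 7, 6), (6, 2, 7), (7, 8, 9)]
```

Because exactly one point is nominated per partition, the points are
spread across the image even when the strongest responses cluster in one
area.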

Establishing  frame  to  frame  correspondences  is,  perhaps,  the  most 
difficult  step  in  the  multiframe  procedure.  Occlusion  of  regions  and 
regions  that  are  not  rigid  (e.g.,  smoke  or  vehicle  exhaust)  can  be 
difficult  to  match.  Also,  because  the  platform  is  constantly  moving, 
regions  continuously  enter  and  leave  the  field  of  view.  Full  intensity 
area  correlation  is  a  widely  used  and  well  understood  technique  for  solving 
the  correspondence  problem.  Correlation  is  applied  by  defining  a  reference 
window  centered  on  an  interesting  point  in  the  earlier  frame.  A  search 
window  which  is  several  pixels  larger  than  the  reference  window  is  defined 
in  the  current  live  frame  at  the  same  x,y  location.  A  template  the  size  of 
the  reference  window  is  then  moved  throughout  the  search  window  area.  The 
live  template  that  best  matches  the  reference  template  represents  the 
updated  location  of  the  interest  point  in  the  live  frame.  The  measure  of 
similarity  is  the  normalized  cross-correlation  coefficient,  which  was 
described  previously. 

When the live template and reference template match exactly, the
correlation coefficient ($\rho$) is 1. When the two templates are exactly
inverted, $\rho$ is -1. If the reference and live templates are totally
uncorrelated, $\rho$ is 0. The row and column in the live frame where the
correlation is maximized represent the location where the live template
best matches the reference template. This change in location of an
interest point between two frames is defined as optical flow. The result
of this processing on all interest points, which is called an optical flow
field, is a quantification of the frame to frame disparities (Figure 4.1-4).
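
The reference-window and search-window procedure can be sketched in a few
lines. This is a minimal illustration, not the correlation software used
here: the function names, the sign convention for the displacement, and
the synthetic 8x8 frames are all assumptions made for the example.

```python
import math


def ncc(ref, live):
    """Normalized cross-correlation of two equal-size windows (lists of rows)."""
    r = [v for row in ref for v in row]
    l = [v for row in live for v in row]
    r_mean, l_mean = sum(r) / len(r), sum(l) / len(l)
    num = sum((a - r_mean) * (b - l_mean) for a, b in zip(r, l))
    den = math.sqrt(sum((a - r_mean) ** 2 for a in r)
                    * sum((b - l_mean) ** 2 for b in l))
    return num / den if den else 0.0  # flat windows carry no information


def window(img, top, left, height, width):
    """Cut a height x width window whose upper-left corner is (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def best_match(ref, live_img, top, left, search):
    """Slide the template +/- search pixels; return (rho, d_row, d_col)."""
    h, w = len(ref), len(ref[0])
    best = (-2.0, 0, 0)
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            t, l = top + dr, left + dc
            if t < 0 or l < 0 or t + h > len(live_img) or l + w > len(live_img[0]):
                continue  # template would fall outside the live frame
            rho = ncc(ref, window(live_img, t, l, h, w))
            if rho > best[0]:
                best = (rho, dr, dc)
    return best


def paste(img, patch, top, left):
    """Copy patch into img with its upper-left corner at (top, left)."""
    for r, row in enumerate(patch):
        for c, v in enumerate(row):
            img[top + r][left + c] = v
    return img


pattern = [[1, 2, 3], [4, 9, 5], [6, 7, 8]]
prev = paste([[0] * 8 for _ in range(8)], pattern, 2, 2)   # earlier frame
live = paste([[0] * 8 for _ in range(8)], pattern, 3, 4)   # shifted by (1, 2)

ref = window(prev, 2, 2, 3, 3)
print(best_match(ref, live, 2, 2, search=2))  # (1.0, 1, 2)
```

The returned displacement of one row and two columns is the optical flow
of this interest point. In a low-contrast window, `ncc` would return
similar values at many offsets, which is the overcorrelation case that a
correlation threshold alone cannot reject.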

Credibility of the optical flow field is maintained by establishing a
goodness criterion in the form of a correlation threshold. This threshold
reduces the risk of tracking low confidence interest points that do not
accurately model the true scene motion. However, the correlation threshold
will not eliminate those points that do not conform to the global optical
flow because of overcorrelation. Overcorrelation can occur when a feature
is selected by the interest operator in a window containing very low
contrast or cyclical patterns. The correlation coefficient for this type
of point equally exceeds the threshold at a number of locations, which
causes the maximum to be deterministically assigned. To purge the flow
field (Figure 4.1-5) of this type of interest point, the affine transform
is used. The affine model is a first-order approximation to the optical
flow field. The affine transformation is defined as:

$$x' = A_1 x + A_2 y + A_3$$

$$y' = B_1 x + B_2 y + B_3$$

Figure 4.1-4. Synthetic Optical Flow Field

To derive the affine coefficients, the flow field is least-squares fit
to an affine sensor model and the residual error for each point is recorded.
The flow point with the worst residual error is discarded, and the
reduced list is fit to the sensor model and the coefficients recomputed.
These iterations continue until either all flow points have an acceptably
small residual error, or until further reduction of the list would cause
the set to become smaller than a specified minimum number of flow points.
Once the final set of affine coefficients has been computed, the residual
error for each point in the original set is recomputed. Those points that
exceed the maximum affine error threshold (currently 10) are removed from
the list and replaced by a new entry.

Figure 4.1-5. Flow Field Refinement


In  the  affine  model,  the  residual  error  is  defined  as  the  product  of 
the  magnitude  of  the  observed  flow  and  the  sine  of  the  angle  between  the 
predicted  flow  vector  and  the  observed  flow  vector.  If  the  cosine  of  the 
angle  is  less  than  zero,  the  residual  error  is  the  magnitude  of  the  actual 
flow. 
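The refinement loop and the residual definition can be sketched together as shown below. This is an illustration under stated assumptions, not the original implementation: the function names and the min_points default are hypothetical, the same threshold is assumed for the iteration test and the final screening, and a zero-length predicted vector (for which the angle is undefined) is assumed to yield the observed magnitude as its error.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares fit of x' = A1*x + A2*y + A3 and y' = B1*x + B2*y + B3."""
    design = np.column_stack([src[:, 0], src[:, 1], np.ones(len(src))])
    a, *_ = np.linalg.lstsq(design, dst[:, 0], rcond=None)
    b, *_ = np.linalg.lstsq(design, dst[:, 1], rcond=None)
    return a, b

def residual_errors(src, dst, a, b):
    """|observed| * sin(angle to predicted); |observed| if cos(angle) < 0."""
    design = np.column_stack([src[:, 0], src[:, 1], np.ones(len(src))])
    predicted = np.column_stack([design @ a, design @ b]) - src
    observed = dst - src
    mag_o = np.hypot(observed[:, 0], observed[:, 1])
    mag_p = np.hypot(predicted[:, 0], predicted[:, 1])
    dot = (observed * predicted).sum(axis=1)
    cross = observed[:, 0] * predicted[:, 1] - observed[:, 1] * predicted[:, 0]
    # |observed| * sin(angle) equals |cross| / |predicted|.
    safe = np.where(mag_p > 0, mag_p, 1.0)
    sin_term = np.where(mag_p > 0, np.abs(cross) / safe, mag_o)  # assumption
    return np.where(dot < 0, mag_o, sin_term)

def refine_flow(src, dst, max_error=10.0, min_points=4):
    """Iteratively drop the worst point, then screen the original set."""
    keep = np.arange(len(src))
    while True:
        a, b = fit_affine(src[keep], dst[keep])
        err = residual_errors(src[keep], dst[keep], a, b)
        if err.max() <= max_error or len(keep) - 1 < min_points:
            break
        keep = np.delete(keep, err.argmax())
    # Recompute the error of every original point with the final model.
    final_err = residual_errors(src, dst, a, b)
    return a, b, final_err <= max_error
```

Note that this residual measures only the component of the observed flow perpendicular to the prediction (or the full magnitude when the two vectors oppose each other), so a point whose flow has the right direction but the wrong length is not penalized by it.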

Accounting and maintenance of the optical flow field are accomplished
through the use of a scene history file (Table 4.1-1). The history file
provides a mechanism for accumulating interimage information regarding the
selected interest points, thereby providing a historical record of the
scene. Key information in the history file includes:

1   KAV - Entry key of this optical flow point (key access value)

2   FRM - Frame number

3   ROW - Row location of point for this frame

4   COL - Column location of point for this frame

5   MET - Metric value of interest point at point selection time

6   AVG - Average of metrics in window partition

7   SDV - Sigma of metrics in window partition

8   MET - Highest correlation coefficient from area correlator

9   DIR - Quantized direction indicating orientation of point change

10  RDS - Row displacement of a point between last and current frame

11  CDS - Column displacement of a point between last and current
frame

12  DIS - Total distance that a point has moved between last and
current frame

13  ATE - Current affine transform error of a point.
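One way to picture a history file entry is as a fixed record per point per frame. The sketch below is hypothetical: the class, the helper function, and the field types are assumptions, the second MET entry (correlation coefficient) is renamed cor to keep field names unique, and DIS is assumed to be the Euclidean length of the (RDS, CDS) displacement.

```python
import math
from dataclasses import dataclass

@dataclass
class HistoryRecord:
    kav: int        # entry key of this optical flow point
    frm: int        # frame number
    row: int        # row location of the point in this frame
    col: int        # column location of the point in this frame
    met: float      # interest metric at point selection time
    avg: float      # average of metrics in the window partition
    sdv: float      # sigma of metrics in the window partition
    cor: float      # highest correlation coefficient (second "MET" in the list)
    dir_code: int   # quantized direction of the point change
    rds: int        # row displacement since the last frame
    cds: int        # column displacement since the last frame
    dis: float      # distance moved since the last frame (assumed Euclidean)
    ate: float      # current affine transform error

def next_record(prev, frm, row, col, cor, ate, dir_code=0):
    """Build the record for a new frame from the previous frame's record."""
    rds = row - prev.row
    cds = col - prev.col
    return HistoryRecord(prev.kav, frm, row, col, prev.met, prev.avg,
                         prev.sdv, cor, dir_code, rds, cds,
                         math.hypot(rds, cds), ate)
```

The selection-time values (MET, AVG, SDV) are carried forward unchanged, while the per-frame values are recomputed for each new frame.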

4.2  Flow  Vector  Selection 

Once  a  scene  history  for  a  consecutive  sequence  of  images  has  been 
generated,  a  process  is  run  that  selects  the  optimum  flow  vector  for  each 
detection  report.  The  locations  of  the  detection  reports  are  extracted 
from  the  image  header  file,  which  corresponds  to  the  last  image  processed 
in  the  image  sequence.  The  vector  selection  program  computes  the  distance 
between  each  flow  vector  that  has  been  successfully  maintained  over  the 
entire  sequence,  and  the  detection  reports  (Figure  4.2-1).  The  flow  vector 
that  is  closest  to  each  detection  report  is  used  to  decompose  the  frame  to 
frame  positional  shifts  that  occurred  over  the  sequence  of  images  during 
the  application  of  the  subimage  registration  process. 
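The selection rule can be sketched as a nearest-neighbor search restricted to fully tracked vectors. The data layout below (a dictionary of per-frame positions keyed by flow point) and the function name are assumptions for illustration, and Euclidean distance in the last frame is assumed as the distance measure.

```python
import numpy as np

def select_flow_vectors(tracks, detections, n_frames):
    """Map each detection report index to the key of its closest flow vector.

    tracks: dict of key -> list of (row, col) positions, oldest first.
    detections: list of (row, col) detection reports in the last image.
    Only vectors maintained over all n_frames are candidates.
    """
    complete = {k: v for k, v in tracks.items() if len(v) == n_frames}
    if not complete:
        return {}
    keys = list(complete)
    last_pos = np.array([complete[k][-1] for k in keys], dtype=float)
    chosen = {}
    for i, det in enumerate(detections):
        delta = last_pos - np.asarray(det, dtype=float)
        dist = np.hypot(delta[:, 0], delta[:, 1])
        chosen[i] = keys[int(dist.argmin())]
    return chosen
```

A vector that was dropped partway through the sequence is ignored even if it lies closer to the detection report, because it cannot supply positional shifts for every frame.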


Figure  4.2-1.  Flow  Vector  Selection 


4.3  Data  Smoothing 


The data smoothing algorithm performs multisubimage registration
(three to nine images) and applies a pixel smoothing technique between
subimage windows. The smoothing process is accomplished by applying a
1x1xN filter to each pixel in the subimage set, where N equals the number
of registered images (depth); the filter type can be a mean, a median, or
another such filter.

A  prerequisite  for  this  process  is  the  existence  of  an  optical  flow 
history  file  generated  over  the  N  consecutive  images.  The  history  file 
contains  the  frame  to  frame  positional  changes  of  scene  context  that  occur 
over  the  sequence  of  images.  The  positional  changes  recorded  in  this  file 
are  used  to  register  the  subimages  extracted  from  each  full  frame  image. 

The  smoothed  images  are  generated  by  working  backwards  through  the 
history  file.  The  last  image  written  to  the  history  file  will  be  the  first 
image  processed.  Each  vehicle  in  an  image  is  processed  in  the  same 
manner.  The  general  concept  is  as  follows: 

1  Identify an optical flow vector. Using the optical flow history
file, the optical flow vector closest to a detection report
(vehicle) in the last image in the sequence is identified. The
vector must have been tracked through all the images in the
sequence. This flow vector is used to register all subimages in
the sequence pertaining to this vehicle.

2  Equalize the number of passes. The number of passes (and output
smoothed images) equals the total number of images in the test
sequence minus the number of images used in the subimage smoothing
process (called a cluster) plus 1 (i.e., passes = total - cluster + 1).

3  Read cluster images. For each cluster (such as five images per
cluster) the most current image (newest) is the master image
(Figure 4.3-1). The image's corresponding header file is used to
determine the extent of the window needed to fully encompass the
detected vehicle and to include enough background for metric
computation (subimage includes vehicle and background collar area).
Each subsequent image in the cluster is read, and a subimage of
equal size to the first is located (using the flow vector to
compensate for x,y change) and stored. When all five images have
been read, registered, and stored, a 1x1x5 filter is applied. The
output filtered subimage is then placed back into the master image.
After each vehicle in the image has been processed in this manner,
the smoothed master image is written to disk. The next newest
image is then read along with its set of four consecutive older
images. The process stops when a cluster of five images cannot be
formed.

Figure 4.3-1. Multiframe Data Smoothing
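The three steps above can be sketched for a single vehicle as follows. This is a simplified illustration, not the original program: the function name and arguments are hypothetical, the window is assumed to be centered on the tracked position in each frame (standing in for the header-derived window and flow vector compensation), and results are returned in memory rather than written to disk.

```python
import numpy as np

def smooth_sequence(frames, positions, half_h, half_w, cluster=5,
                    reducer=np.median):
    """Apply a 1x1xN filter (N = cluster) to registered subimages.

    frames: list of 2-D arrays, oldest first.
    positions: per-frame (row, col) of the window center for one vehicle.
    Returns the smoothed master images, newest first.
    """
    total = len(frames)
    passes = total - cluster + 1          # passes = total - cluster + 1
    outputs = []
    for p in range(passes):
        newest = total - 1 - p            # work backwards through the set
        stack = []
        for k in range(newest, newest - cluster, -1):
            r, c = positions[k]
            stack.append(frames[k][r - half_h:r + half_h + 1,
                                   c - half_w:c + half_w + 1])
        # One sample per frame at each registered pixel location.
        filtered = reducer(np.stack(stack, axis=0), axis=0)
        master = frames[newest].astype(float).copy()
        r, c = positions[newest]
        master[r - half_h:r + half_h + 1, c - half_w:c + half_w + 1] = filtered
        outputs.append(master)
    return outputs
```

Passing reducer=np.mean gives the mean variant of the filter. With 30 frames, a cluster of five yields 26 outputs and a cluster of nine yields 22.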
5.0 EXPERIMENTS


A set of experiments was defined to assess the effects of multiframe
data smoothing on vehicle signatures and segmenter performance. The
experiments  consisted  of  applying  a  multiframe  data  smoothing  operator  and 
an  independent  frame  enhancement  operator  to  three  sets  of  consecutive 
sequences  of  ERIM  truthed  TI  images.  The  rule  directed  segmenter  was  then 
applied  to  the  raw  data,  smoothed  data,  and  independently  filtered  data.  A 
comparative  analysis  of  segmenter  performance  was  conducted  by  evaluating 
the  segmenter  stability  metrics  for  each  data  type.  Data  variations  in 
vehicle  signatures  were  analyzed  from  the  results  of  the  data  variation 
metrics. 

The filters used for the multiframe data smoothing experiments were
the 1x1x5 median, 1x1x7 median, and 1x1x9 median. The results obtained
using the multiframe mean filter were not different enough from those of
the median to warrant extensive testing. The independent frame enhancement
operator was the 3x3x1 median. The inclusion of this operator enabled us
to compare the results against a more conventional approach to noise
reduction.

The primary difference between the multiframe median and the local
median is the method of sample set selection. The multiframe approach uses
n sample elements (one from each of n consecutive frames), each
representing the same local area of information in the image. The
integration of this data improves signal quality without jeopardizing
organizational detail.

The 3x3x1 median uses nine sample elements (from the same frame), each
representing a different local area of information in the image. The
relationship between the data represented by the nine samples determines
whether the result represents signal improvement (all samples are related)
or signal degradation (all samples are unrelated). Since both cases occur
within the image, the result represents a combination of improved and
decreased signal quality. This property of the 3x3x1 median makes it
undesirable as a preprocessor for functions that require a precise
numerical representation of the data, such as feature extraction. However,
the ability of the median to reduce some noise, while preserving step
edges, is useful to some types of image segmentation operators. In
comparison, the multiframe approach is ideally suited for both signal
improvement and feature enhancement.
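The difference in sample set selection can be illustrated with a small synthetic case. The sketch below is an assumption-laden example, not the filters used in the experiments: the function names are hypothetical, the frames are assumed to be perfectly registered, and the 3x3 filter replicates edge pixels at the image border.

```python
import numpy as np

def median_3x3(image):
    """3x3x1 median: nine samples from neighboring pixels of one frame."""
    padded = np.pad(image, 1, mode="edge")
    h, w = image.shape
    shifts = [padded[r:r + h, c:c + w] for r in range(3) for c in range(3)]
    return np.median(np.stack(shifts), axis=0)

def median_temporal(frames):
    """1x1xN median: one sample per frame at the same registered pixel."""
    return np.median(np.stack(frames), axis=0)
```

For a scene containing a one-pixel-wide bright line, each 3x3 neighborhood centered on the line holds three line samples and six background samples, so the local median replaces the line with background. The temporal median of five registered frames, each carrying an isolated noise spike at a different pixel, removes the spikes and leaves the line intact.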

5.1  Test  Set  1 

The  first  data  set  tested  was  the  image  sequence  extracted  from  ERIM 
data  tape  number  3014-12,  set  4D.  The  data  tape  included  the  raw  imagery, 
ground  truth  information,  metrics,  and  truth  silhouettes.  The  first  30 
images  (44  total)  in  the  sequence  were  removed  from  the  tape,  using  the 
ATRWG  read  software. 

The  image  sequence  (Figure  5.1-1)  contains  two  tanks,  which  will  be 
referenced  as  tank-1  (the  rightmost  tank)  and  tank-2  (the  leftmost  tank). 
The  images  show  side  views  of  the  tanks  with  their  gun  barrels  in  the 
combat  position.  The  engines  and  wheels  are  hotter  than  the  bodies  of  the 
tanks,  which  indicates  that  the  tanks  are  either  in  motion  or  have  recently 
been  moving.  Tank-1,  a  T46,  is  approximately  1946  meters  from  the  sensor; 
Tank-2,  a  T95,  is  approximately  1919  meters  from  the  sensor.  Excluding  the 
tanks,  the  scene  is  virtually  void  of  any  significant  context  and  only  has 
a  gray  level  range  of  approximately  30  intensity  levels  (Figure  5.1-2). 

The  road  on  which  the  tanks  are  positioned  is  almost  nondetectable.  There 
appears  to  be  some  sort  of  runway  directly  above  tank-2.  The  lack  of 
context  in  this  sequence  makes  frame  to  frame  feature  correspondence  very 
difficult.  However,  the  close  range  of  the  tanks  permits  a  better 
assessment of the effects of multiframe smoothing on the structural detail
of  the  tanks,  segmenter  performance,  and  metric  behavior. 


Figure  5.1-1.  Tank  Image  Sequence 


Scene  Motion 


The 30-image test set was processed using the scene motion extraction
software. The parameters for the point selection and tracking operators
were set as follows:

1  Size contrast inner window size: 9 pixels wide, 7 pixels high

2  Partition and local maximum: 6x6 grid surface (36 total points)

3  Correlation coefficient threshold: 0.7 (less than 0.7 is deleted)

4  Affine error threshold: 8.0 (greater than 8.0 is deleted)
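For reference, the parameter set and the two deletion rules can be written down compactly. The class and method names below are hypothetical; only the numeric values come from the list above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrackingParams:
    window_width: int = 9            # size contrast inner window, pixels
    window_height: int = 7
    grid: int = 6                    # 6x6 partition of the frame
    min_correlation: float = 0.7     # less than this is deleted
    max_affine_error: float = 8.0    # greater than this is deleted

    @property
    def total_points(self):
        """One interest point (local maximum) per grid partition."""
        return self.grid * self.grid

    def keeps(self, correlation, affine_error):
        """True if a tracked point survives both thresholds."""
        return (correlation >= self.min_correlation
                and affine_error <= self.max_affine_error)
```

With these settings, 6 surviving points out of 36 is 16.7 percent of the initial selection, which the text reports as 16 percent.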

The number of feature points tracked for the entire 30-frame sequence
consisted of only six points, or 16 percent of the initial number selected
(Figure 5.1-3). Four of the nominated points pertained to contrast
measures between the two tanks and their local background. The high
turnover in feature points was due exclusively to poor frame to frame
correlation caused by bland scene conditions. When points are selected in
low contrast areas, the correlation operator is highly influenced by the
noise component of the signal. As the data becomes more nearly
homogeneous, the correlation process actually attempts to correlate the
noise.

The flow vectors selected for performing multisubimage registration
for the two tanks were vector 3 (Table 5.1-I, for tank-2) and vector 4
(Table 5.1-II, for tank-1). An assessment of the quality of these vectors
from the "Optical Flow Metric Report" (Table 5.1-III) indicates a high
degree of credibility in accurately tracking the frame to frame positional
changes of the two tanks. A visual assessment of the correlation accuracy
is shown in Figure 5.1-4. The bright white pixel superimposed on the tanks
indicates the x,y locations of the selected feature centroids. For tank-2
(above two pictures) the contrast feature selected was in the wheels of the
tank. The dominant feature for tank-1 was its engine. The location of the
feature centroids in frame 1, compared to their ending position in frame
30, seems to indicate that an accumulation of a one-pixel offset may have
occurred over the 30-frame sequence. If the correlation drift did occur,
it had no detectable effect on the data smoothing results.

Figure 5.1-3. Optical Flow Field

History File Report Data Record

TABLE 5.1-III

Optical Flow Metric Report


Figure  5.1-4.  Scene  Correlation 


Data  Smoothing 

The parameters used for multiframe smoothing consisted of registering
clusters of five consecutive subimage windows (placed about the vehicles)
and applying a 1x1x5 (row by column by depth) median filter. This process
generated a data set consisting of 26 images (30 total images - 5 cluster
size + 1). These 26 raw data images were also processed using a
conventional 3x3x1 median filter.

A visual comparison of the effects of the two enhancement techniques
is shown in Figure 5.1-5. The plots depict the intensity structure of a
single row of pixels extracted from image 20050510.IMG. The pixels extend
across the subimage window of tank-2, passing through the center of the
vehicle. The upper left image in Figure 5.1-5 shows how the multiframe
approach preserves the detail on the vehicle. The center-left image shows
the influences of unrelated information on the filter process. Results
from a 5x5x1 median filter were also included for comparison. A more
detailed view of the multiframe filter is shown in Figure 5.1-6. The 3-D
projection shows the ability of the filter to preserve surface detail on
the vehicle, while suppressing noise (most visible in the background data).
An improvement in the organization of the tank wheels and engine
compartment can be seen in the gray level picture of the tank.

Figure 5.1-6. Noise Reduction Via Multiframe Median Filter

To determine the effects of the two enhancement techniques on
segmenter performance, the 1x1x5 smoothed data set, 3x3x1 median-filtered
data set, and raw data set were processed using the rule directed segmenter
(Figure 5.1-7). The results indicated that both enhancement techniques
improved segmenter performance. The segmentation accuracy metric, BACC,
for tank-2 (Figure 5.1-8) reveals that the level of improvement is almost
identical for both techniques. This implies that both techniques have
properties that are beneficial to segmentation.

Figure 5.1-7. Segments (RDS) from 26 Consecutive Frames (Tank 2)

To understand how the two enhancement techniques improved performance,
it is necessary to determine how each technique altered the data. This can
be accomplished by comparing the intensity-based metrics computed on the
two enhanced data sets to those computed on the raw data set. A summary of
those results is presented in Table 5.1-IV (1x1x5 median versus raw data)
and Table 5.1-V (3x3x1 median versus raw data). A more visual examination
of the intensity-based contrast metric is given in Figure 5.1-9. The plots
compare the values of the enhanced data metrics to the raw data metrics
computed on tank-2 (y axis) for each of the 26 images (x axis) in the data
sets. A study of the two plots shows that the 3x3x1 median behaves as a
low pass filter, suppressing the high-frequency information and
subsequently reducing the metric values. The general profile of the median
graph is very similar to the raw data graph with the exception of a scale
factor. Conversely, the graph of the 1x1x5 median filter shows a reduction
in the range (vertical extent) of the metric with no decrease in metric
response. The results demonstrate the ability of multiframe filters to
reduce noise and improve signal stability. A summary of all five
intensity-based metrics is given in Figures 5.1-10 through -14. The plots
depict the distribution of the metrics for the two tanks (1 and 2) for each
of the three 26-image data sets. In general, the important aspects of the
plots are the organizational features of the metric distributions such as
range, level of response, and clustering. The graphs reflect the overall
superiority of the multiframe smoothing approach to that of the independent
frame filter.

5.2  Test  Set  2 

The  second  data  set  tested  was  the  image  sequence  extracted  from  ERIM 
data  tape  number  3015-12,  set  40.  The  data  tape  included  raw  imagery, 
ground  truth  information,  metrics,  and  truth  silhouettes.  The  first  30 
images  (34  available  in  sequence)  were  removed  from  the  tape,  using  the 
ATRWG read software. The image sequence (Figure 5.2-1) contains three
military vehicles: a truck (the leftmost object, object-1), a T95 Tank
(the center object, object-2), and a T32 Tank (the rightmost object,
object-3).

Figure 5.2-1. Image Sequence

Figure 5.2-2. Intensity Histogram of Channel 1

Scene  Motion 

The 30-image test set was processed using the scene motion extraction
software. Parameters for the point selection and tracking operators were
set as follows:

1  Size contrast inner window size: 5 pixels wide, 3 pixels high

2  Partition and local maximum: 8x8 grid surface (64 total points)

3  Correlation coefficient threshold: 0.7 (less than 0.7 is deleted)

4  Affine error threshold: 8.0 (greater than 8.0 is deleted).

The  number  of  feature  points  tracked  for  the  entire  30-frame  sequence 
consisted  of  only  two  points,  or  3  percent  of  the  initial  number  selected. 
The  two  points  pertained  to  contrast  measures  between  the  two  tanks  and 
their  local  background.  The  low  contrast  of  the  truck  made  it  impossible 
for  the  frame  to  frame  correlator  to  track.  A  visual  review  of  the  tank 
flow  vectors  indicated  that  correlation  drift  made  them  unreliable  for 
multiframe  smoothing.  The  lack  of  a  reliable  optical  flow  history  file  for 
the  30-frame  set  made  it  necessary  to  create  one  manually  (Figure  5.2-3). 
Manual  generation  of  the  optical  flow  history  file  was  accomplished  by 
displaying  the  images  on  a  monitor  and  noting  the  x,y  positional  change  of 
each  vehicle,  using  a  cursor  which  controlled  a  minimum  encompassing  object 
box.  The  process  was  applied  using  a  zoom  factor  of  4  on  the  images  to 
minimize  registration  errors.  The  manual  tracking  process  revealed  the 
extensive  level  of  frame  to  frame  structural  variation  of  the  vehicles. 
These  structural  variations,  along  with  the  low  contrast,  made  the  manual 
tracking  process  about  75  percent  reliable. 


Data  Smoothing 

The  parameters  used  for  multiframe  smoothing  consisted  of  registering 
clusters  of  five  consecutive  subimage  windows  (placed  about  the  vehicles) 
and  applying  a  1x1x5  median  filter.  This  process  generated  a  data  set 
consisting  of  26  images  (30  total  images  -  5  cluster  size  +1).  A  second 
multiframe  smoothing  operator,  which  registered  clusters  of  nine 
consecutive  subimage  windows  and  applied  a  1x1x9  median  filter,  was  also 
used.  This  process  generated  a  data  set  consisting  of  22  images.  The 
consideration  of  nine  samples  in  place  of  five  attempts  to  further 
compensate  for  the  low  contrast  image  conditions.  In  addition,  the  same  26 
raw  data  images  were  processed  using  a  conventional  3x3x1  median  filter. 

To  determine  the  effects  of  the  enhancement  techniques  on  segmenter 
performance,  the  1x1x5  smoothed  data  set,  1x1x9  smoothed  data  set,  3x3x1 
median  filtered  data  set,  and  raw  data  set  were  processed  using  the  rule 
directed  segmenter.  An  assessment  of  the  segmenter  performance  results 
(measured  using  the  BACC  evaluation  metric)  indicated  that  none  of  the 
techniques  has  a  significant  effect  on  performance.  They  also  showed  that 
each  vehicle  was  affected  differently. 

Object-1 (the truck - Figure 5.2-4) had a small decrease in
segmentation accuracy for the two data smoothing filters (1x1x5 and 1x1x9),
while the local median (3x3x1) improved performance slightly (2 percent).
The effect of the 1x1x5 median filter on object-1 can be seen in Figure
5.2-5, where a gray level threshold of 61 was applied to the first four
images of the 1x1x5 median-filtered object and the raw-data images. The
threshold represented the best number for object to background separation
for both data sets. A greater level of structural consistency can be seen
in the multiframe filtered images. However, for this object at this range,
the changes in performance were still negligible.

Object-2 (the center tank - Figure 5.2-6) had the highest segmentation
accuracy scores, which averaged 76 percent. The multiframe smoothing
filters had a positive effect, reducing the degree of frame to frame
performance variation for this vehicle, but did not increase the overall
segmenter performance. The 3x3x1 filter had less of an effect on
performance stabilization, but managed to increase the overall performance
average by 1 percent.

Figure 5.2-5. Truck Thresholded at 61 (Raw vs Median)

Object-3 (the rightmost tank - Figure 5.2-7) produced results more
typical  of  the  first  set  of  experiments.  The  multiframe  smoothing  filters 
increased  segmenter  performance  and  improved  the  frame  to  frame  structural 
stability  of  the  object.  The  3x3x1  filter  also  improved  segmenter 
performance,  but  did  not  improve  structural  stability. 


The difficulty in obtaining consistent improvement in segmenter
performance and frame to frame structural stability when applying the
multiframe smoothing filters is due primarily to the manually derived
optical flow history that was created for this data set. To confirm this,
four attempts were made at tracking the frame to frame positional changes
of each of the three vehicles through the 30-frame test set. Each attempt
produced a slightly different flow history, which caused the performance
results to differ. Due to the small size of the objects, minor
inaccuracies in determining the x,y vehicle displacements significantly
affected the outcome of the smoothing process. Misregistrations can be
more easily tolerated when object features are spatially large; however,
when a vehicle's engine consists of only a few pixels, a one-pixel offset
is significant. From the variations accumulated among the four manually
derived optical flow history files, a registration error of approximately
25 percent was estimated. This flow error makes an accurate evaluation of
the multiframe smoothing operator difficult for this data set.

Despite the frame to frame registration problems, we were able to
extract positive tendencies of the data smoothing operators. A complete
comparison of each of the three enhancement methods (1x1x5, 1x1x9, and
3x3x1) is shown in the temporal variation metric listings (Tables 5.2-I
through -III). The different responses for each of the three vehicles
substantiate the difficulties in generating accurate optical flow history
for this test set. Nevertheless, general improvements in metric response
and stability are evident in the frame to frame changes in several of the
metrics. A comparison of the entropy-based contrast metric for object-3
(Figure 5.2-8) shows an increase in metric response and stability for the
multiframe smoothing filters, while the 3x3x1 local filter is still very
unstable. This trend is also apparent to a lesser degree in the
intensity-based TIK^ metric for object-2 (Figure 5.2-9). This ability to
improve vehicle characteristics indicates that an accurate extraction of
optical flow should produce more favorable results than those currently
generated. The results also indicate the importance of deriving accurate
optical flow history, especially for vehicles at these ranges and beyond.

Temporal Variation Metrics

5.3 Test Set 3

The third data set tested was an image sequence from ERIM data tape
number 3031-10, set 40. The data tape included raw imagery, ground truth
information, metrics, and truth silhouettes. The first 22 images on the
tape were noncontinuous, unrelated frames of data taken at various times of
the day, which made them inappropriate for testing. The next 21 frames
consisted of consecutive sequences of images digitized at 1-second
intervals, with the exception of 2 frames, which were 2 seconds apart. The
21 frames of data (Table 5.3-1) were removed from tape using the ATRWG read
software.

TABLE 5.3-1

Image List for Experiment 3

2020000D  2020007D  2020015D
2020023D  2020031D  2020039D
2020047D  2020055D  2020063D
2020071D  2020079D  2021007D
2021015D  2021023D  2021031D
2021039D  2021047D  2021055D
2021063D  2021071D  2021079D

The image sequence (Figure 5.3-1) contains three military vehicles: a
jeep (the leftmost object, object-1), an APC (the center object,
object-2), and a truck (the rightmost object, object-3). All three
vehicles are positioned at side view aspects. Object-1, which is
approximately 6341 meters from the sensor, has the most uniformly
distributed intensity contrast of the three vehicles. Object-2,
approximately 6366 meters from the sensor, has the best organized structure
(most visibly distinguishable) of the three vehicles. Object-3,
approximately 6415 meters from the sensor, has bimodal structural
characteristics. Excluding the three vehicles, the scene is void of any
significant context and only has a gray level range of approximately 20
intensity levels (Figure 5.3-2).

Scene Motion


The  21-image  test  set  was  processed  using  the  scene  motion  extraction 
software.  The  parameters  for  the  point  selection  and  tracking  operators 
were  set  as  follows: 

1  Size contrast inner window size: 5 pixels wide, 3 pixels high

2  Partition and local maximum: 8x8 grid surface (64 total points)

3  Correlation coefficient threshold: 0.7 (less than 0.7 is deleted)

4  Affine error threshold: 8.0 (greater than 8.0 is deleted)

The  feature  tracking  software  was  unsuccessful  in  maintaining  the 
positional  changes  of  any  of  the  features  selected  for  tracking.  This 
failure  was  due  to  the  low  contrast  of  the  imagery  and  the  1-second  spacing 
between  frames,  which  allowed  object  signatures  to  change  significantly. 

The lack of an optical flow history file for the 21-frame set made it
necessary  to  create  one  manually  (Figure  5.3-3).  Manual  generation  of  the 
optical  flow  history  file  was  accomplished  by  displaying  truth  images  on  a 
monitor  and  noting  the  x,y  positional  change  of  each  vehicle,  using  a 
cursor  that  controlled  a  minimum  encompassing  object  box.  The  process  was 
applied  on  the  images  using  a  zoom  factor  of  4  to  minimize  registration 
errors. 

Truth images were used in place of raw data images in an attempt to
improve tracking accuracy and to avoid dealing with the low level contrast
conditions of the raw data. The manually derived optical flow field
depicts the significant amount of positional change for each vehicle over
the 21-frame set. The low contrast conditions of the imagery and the
manual tracking process give the derived optical flow field a reliability
rating of about 75 percent.

Figure 5.3-3. Optical Flow Vectors Created for Each Object (panels: Jeep,
APC, Truck)


Data  Smoothing 

Parameters used for multiframe smoothing consisted of registering
clusters of five consecutive subimage windows (placed about the vehicles)
and applying a 1x1x5 median filter. This process generated a data set
consisting of 17 images (21 total - 5 cluster size + 1). A second
multiframe smoothing operator, which registered clusters of seven
consecutive subimage windows and applied a 1x1x7 median filter, was also
used. This process generated a data set consisting of 15 images. The
consideration of 7 samples in place of 5 attempts to further compensate for
the low contrast and to further reduce the degree of frame to frame
variation. In addition, the same 17 raw data images were processed using a
conventional 3x3x1 median filter.

To determine the effects of the enhancement techniques on segmenter
performance, the 1x1x5 smoothed data set, 1x1x7 smoothed data set, 3x3x1
median filtered data set, and raw data set were processed using the rule
directed segmenter. Segmenter accuracy (measured using the BACC evaluation
metric) improved for two of the vehicles, with only a slight decrease in
performance for the third vehicle. The results also revealed that each
vehicle was affected differently by each of the enhancement techniques.

Object-1 (the jeep - Figure 5.3-4) had the largest increase in
segmenter performance of the three vehicles in this data set. The best
response was to the 1x1x7 data smoothing filter, which increased
performance accuracy by an average of 44 percent. All but one of the 1x1x7
smoothed images, processed by the rule directed segmenter, had improved
accuracy. The least successful filter was the conventional 3x3x1 median,
which still improved performance accuracy by 14 percent. The 1x1x5 data
smoothing filter increased performance accuracy by 32 percent, but was less
stable than the 1x1x7 filter. The improved performance is due to a more
accurate frame to frame object registration than in the last data set
tested.

Object-2 (the APC - Figure 5.3-5) also had improved segmentation
accuracy scores after data smoothing. The 1x1x5 filter and 1x1x7 filter
showed average improvements in performance of 24 percent and 22 percent,
respectively, over the raw data results, while the conventional 3x3x1
median showed a 5 percent increase. Although the performance gain of the
1x1x7 filter was 2 percentage points lower than that of the 1x1x5 filter,
it still exhibits a high degree of frame to frame stability. The benefits
of multiframe data smoothing can be seen by comparing the variations in
structural characteristics of the APC for five consecutive frames (Figure
5.3-6, raw data; Figure 5.3-7, 1x1x7 smoothed data; Figure 5.3-8, 3x3x1
smoothed data). The highest degree of structural similarity is seen
between the 1x1x7 smoothed data images. The 3x3x1 filtered images show the
smoothing effect but lack the frame to frame consistency seen in the 1x1x7
results.

Object-3 (the truck - Figure 5.3-9) had lower performance scores than
the other two vehicles. Performance accuracy was reduced by both of the
multiframe smoothing operators and by the conventional 3x3x1 median, which
produced the worst results. Although the average segmenter performance of
the 1x1x7 median-smoothed data set was 12 percent lower than the raw data
results, an increase of 15 percent was achieved in frame to frame
stability of results. The stability factor, an important property of the
multiframe smoothing approach, is also extremely beneficial during the
feature mapping process used for object classification.


Comparison  of  Binary  Cross  Area  Co 




Nevertheless, we were still able to extract positive tendencies of the
data smoothing operators. A comparison of each of the three data
enhancement methods is shown in Tables 5.3-II through -IV. The different
responses for each of the three vehicles point out the difficulties in
generating accurate optical flow history for this test set. However,
general improvements in metric response and stability can be seen in the
frame to frame changes in several of the metrics. For example, a
comparison of the intensity-based TIR*^ metric for object-2 (Figure ...)
shows an increase in metric response and stability for the multiframe
smoothing filters, while the 3x3x1 median filter shows an increase in
metric response and a decrease in metric stability. The increased metric
response and stability reflect the beneficial characteristics of the
multiframe smoothing operator.

These test results support the conclusions expressed in the second
experiment evaluation, which emphasized the importance of obtaining an
accurate optical flow history, especially for vehicles at these ranges.


Temporal Variation Metrics

Binary Area Cross Correlation    0.451    0.395    0.056    12.45

Object  Object  Standard Deviation Of Metric  Central
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The  temporal  processing  system  used  in  multiframe  integration  was 
designed  to  solve  the  two  major  conceptual  problems.  The  most  difficult  of 
the  two  problems  is  the  extraction  of  accurate  scene  motion,  defined  as  the 
frame  to  frame  positional  changes  of  scene  information.  More  specifically, 
scene  motion  entails  recording  the  x,y  location  of  specific  scene  context 
as  a  function  of  time.  The  accumulated  scene-motion  information  is  used  to 
align  subimage  windows  extracted  from  a  discrete  number  of  consecutive  data 
frames. The second problem is the determination of an effective technique
for integrating the stack of registered subimage windows. The integration
technique must reduce the independent random fluctuation in the imagery and
improve signal quality and stability. In this section, we present our
conclusions and make recommendations for the scene-motion extraction and
multiframe-integration software designed under this contract.
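
The alignment step can be sketched as follows, assuming the scene-motion
history is available as one (x, y) position per frame for the tracked
vehicle. The names extract_window and register_stack are hypothetical and
serve only to illustrate the idea of cutting one window per frame about the
recorded position.

```python
def extract_window(frame, center, half):
    """Cut a (2*half+1)-square window centred on center = (x, y)."""
    x, y = center
    if (x - half < 0 or y - half < 0
            or y + half >= len(frame) or x + half >= len(frame[0])):
        raise ValueError("window extends past the image edge")
    return [row[x - half:x + half + 1]
            for row in frame[y - half:y + half + 1]]


def register_stack(frames, track, half):
    """Align one subimage window per frame using the recorded track.

    frames: list of 2D lists (full images).
    track:  list of (x, y) positions, one per frame, taken from the
            scene-motion history.
    Returns a stack of windows in which the tracked point sits at the
    window centre in every frame.
    """
    return [extract_window(f, c, half) for f, c in zip(frames, track)]
```

The resulting stack is the input that a multiframe integration filter would
operate on; any error in the track appears directly as misregistration
between the windows.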

Scene  Motion  Extraction 

The implemented system for extracting scene motion consists of an
interest point operator, a partition and local maximum operator, an
intensity-based area correlator, and an optical flow noise filter (Figure
6.0-1).

Figure 6.0-1. System for Extracting Scene Motion
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Size  contrast  operator 

The size contrast operator (Figure 6.0-2) emphasizes unique local
regions, based on intensity information. The uniqueness feature of
the operator is contrast. Contrast is a fundamental feature;
texture, gradient, variance, and other image features are dependent
upon contrast. This dependence provided the motivation for the use
of this feature.

The size contrast operator has the advantage of being size
adjustable. This feature allows it to be gated for specific image
context. In our experiment, the inner window was set to object size
to maximize the potential for nominating vehicles as the feature
points to be tracked. Tracking vehicles is highly desirable since
the multiframe-integration registration process requires a flow
vector near each vehicle. When the flow vector directly represents
the temporal transformation of a vehicle, misregistration errors
are minimized. In addition to the size criterion, the size
contrast metric provides information about the integrity of each
feature (Figure 6.0-3). Locations where the metric forms high
sharp peaks represent well organized contrast regions, which are
well suited for feature tracking. Locations where the metric is
low or where the metric is constant over a large area represent low
confidence regions.
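
The exact form of the size contrast metric is not given in this section.
The sketch below assumes one plausible form: the absolute difference
between the mean intensity of an inner (object-sized) window and the mean
intensity of the surrounding frame of an outer window. The function name
and both window parameters are illustrative assumptions.

```python
def size_contrast(image, row, col, inner_half, outer_half):
    """Assumed size contrast metric at (row, col).

    Compares the mean of the inner window with the mean of the ring of
    pixels lying inside the outer window but outside the inner one.
    """
    if outer_half <= inner_half:
        raise ValueError("outer window must be larger than inner window")
    if (row - outer_half < 0 or col - outer_half < 0
            or row + outer_half >= len(image)
            or col + outer_half >= len(image[0])):
        raise ValueError("outer window extends past the image edge")
    inner_sum = inner_n = outer_sum = outer_n = 0
    for r in range(row - outer_half, row + outer_half + 1):
        for c in range(col - outer_half, col + outer_half + 1):
            if abs(r - row) <= inner_half and abs(c - col) <= inner_half:
                inner_sum += image[r][c]
                inner_n += 1
            else:
                outer_sum += image[r][c]
                outer_n += 1
    return abs(inner_sum / inner_n - outer_sum / outer_n)
```

Under this assumption the metric peaks sharply when the inner window is
centred on an object of matching size and falls to zero over uniform
background, which is the behavior described above for high and low
confidence regions.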

The overall performance of the size contrast operator for the data
sets was very good, considering the characteristics of the imagery.
The close-range image data set was almost void of any detail, with
the exception of the two military vehicles. For this data set,
both vehicles were selected as local points of maximum interest.
Both vehicles were successfully tracked through the entire image
sequence. The two long-range image data sets were void of any
detail and contained very low contrast vehicles. Neither of these
image sets contained characteristics that favored feature
tracking. Although the size contrast windows were optimized for
long range vehicles, the metric responses were low and not well
organized. As a result, the selected features could not be tracked
through either of the two image sets.

RAW DATA IMAGE (20200000.IMG)

Figure 6.0-2. Size Contrast Metric Image

Results from the experiments indicate that a constraint exists when
image characteristics are not well represented. Bland image
conditions do not provide feature information considered significant
enough to track with the degree of accuracy required for multiframe
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these  values,  a  determination  of  whether  to  select  a  feature  from  a 
partitioned  window  can  be  made.  In  addition,  when  features  are 
sparse  in  a  specific  area  of  the  image,  other  well  organized 
contrast  features  can  be  substituted.  This  upgrade  will  make  it 
possible  to  predict  the  accuracy  of  the  scene  motion  extraction 
subsystem  by  interpreting  the  strength  of  the  contrast  features 
selected  for  tracking. 

2  Partition  and  local  maximum  operator 

The  partition  and  local  maximum  operator  controls  the  selection  of 
feature  points  from  the  size  contrast  metric  image  and  the  spatial 
distribution  of  those  points.  Currently,  the  user  supplies  the 
partitioning  grid  size  to  the  operator.  The  grid  (Figure  6.0-4)  is 
an  effective  approach  for  controlling  the  spatial  distribution  of 
features. The feature point selection process adapts easily
to range. To increase the number of selected points required for
long  range  images,  the  grid  density  is  simply  increased.  The  grid 
size  selection  process  could  be  made  autonomous  by  using  ground 
truth  information  about  range  to  set  the  grid  parameters.  The 
recessed  boundary  of  the  grid  from  the  edge  of  the  image  assures 
that  each  selected  feature  has  an  opportunity  to  be  tracked.  If  a 
feature  is  too  close  to  the  edge  of  the  image,  a  full  correlation 
window  cannot  be  placed  about  it. 

The partitioning technique has one major drawback in that features
are forced to be selected in windows that are void of any
significant contrast. Window interpretation, which is discussed in the
size-contrast evaluation section, would alleviate this problem.

The  optimum  location  for  this  upgrade  is  within  the  local  maximum 
selection  operator  (Figure  6.0-5).  This  operator  currently  selects 
the  location  of  maximum  metric  response  in  each  window  without 
regard  to  feature  credibility.  The  statistical  examination  of  each 
window  prior  to  point  selection  would  avoid  nominating  meaningless 
features  that  cannot  be  tracked. 
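
A minimal sketch of the partition and local maximum operator with the
proposed credibility check is given below. The statistic used here (peak
value minus the mean of the partition window) and the parameter min_excess
are assumptions chosen for illustration; the section does not specify which
window statistics the upgrade would use.

```python
def select_features(metric, grid_rows, grid_cols, border, min_excess):
    """Pick at most one feature point per partition window.

    metric:     2D list holding the size contrast metric image.
    border:     recess of the grid from the image edge, in pixels, so a
                full correlation window can be placed about each point.
    min_excess: assumed credibility test; the peak must exceed the window
                mean by at least this amount to be nominated.
    Returns a list of (row, col) feature points.
    """
    height, width = len(metric), len(metric[0])
    cell_h = (height - 2 * border) // grid_rows
    cell_w = (width - 2 * border) // grid_cols
    points = []
    for gr in range(grid_rows):
        for gc in range(grid_cols):
            r0 = border + gr * cell_h
            c0 = border + gc * cell_w
            cells = [(metric[r][c], r, c)
                     for r in range(r0, r0 + cell_h)
                     for c in range(c0, c0 + cell_w)]
            peak, pr, pc = max(cells)
            mean = sum(v for v, _, _ in cells) / len(cells)
            # Skip windows that are void of significant contrast.
            if peak - mean >= min_excess:
                points.append((pr, pc))
    return points
```

Setting min_excess to zero reproduces the current behavior, in which every
window nominates a point regardless of feature credibility.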
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3  Full  intensity  area  correlation 

Frame to frame feature matching (tracking) is accomplished through
the application of a full intensity (all 8 bits) area correlator.
Accurate feature tracking is imperative for successful multiframe
integration. Misalignment of subframe windows due to feature
registration errors degrades the multiframe integration. Instead
of reducing noise and improving signal quality, registration errors
introduce additional degradation.

The  correlation  process  is  the  most  difficult  and  time-consuming 
operator  in  the  motion  extraction  subsystem.  The  correlation 
operator  is  applied  iteratively  to  each  feature  point  over  an  area 
defined  as  the  search  area  (Figure  6.0-6). 

For 36 feature points and a search area of 25 by 25 pixels, 22,500
applications of the correlator are required. The number of
applications does not take into account the mathematical computations
necessary to compute each correlation measure. The accumulation of
correlation measures over each search window represents a
correlation surface (Figure 6.0-7).

The organization of the correlation surface determines the degree
of similarity between each contrast feature in the last and current
frame. A close examination of four search areas (Figure 6.0-8)
shows the variations in behavior of the correlator for different
contrast features. The ideal correlation surface would depict a
singular peak representing the location of greatest similarity
(correlation of 1).
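
The search described above can be sketched as follows. The correlation
measure is assumed here to be a zero-mean normalized cross-correlation,
which reaches 1 for a perfect match; the report does not state the exact
measure, and all function names are hypothetical.

```python
from math import sqrt


def correlator_applications(n_points, search_w, search_h):
    """Number of correlator applications for one frame pair."""
    return n_points * search_w * search_h


def window(img, r, c, half):
    """Flatten the (2*half+1)-square window centred on (r, c)."""
    return [v for row in img[r - half:r + half + 1]
            for v in row[c - half:c + half + 1]]


def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den else 0.0  # flat windows carry no information


def correlation_surface(prev, curr, point, half, search_half):
    """Correlate the feature at `point` in prev over a search area in curr."""
    r, c = point
    template = window(prev, r, c, half)
    return [[ncc(template, window(curr, r + dr, c + dc, half))
             for dc in range(-search_half, search_half + 1)]
            for dr in range(-search_half, search_half + 1)]


def best_shift(surface):
    """Return ((d_row, d_col), peak value) for the surface maximum."""
    search_half = len(surface) // 2
    score, i, j = max((v, i, j)
                      for i, row in enumerate(surface)
                      for j, v in enumerate(row))
    return (i - search_half, j - search_half), score
```

For 36 feature points and a 25 by 25 search area, correlator_applications
gives 36 x 25 x 25 = 22,500, in agreement with the count quoted above.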

The success of intensity-based area correlation is totally
dependent on the characteristics of the features being matched. In the
close-range test set, the correlation operator very accurately
determined the frame to frame positional changes of the two
vehicles. In the latter two test sets, the features were so poorly
represented that accurate correlation was not possible. In fact,
the manually derived correlation history, which was required to
process these two test sets, was extremely difficult to obtain and
was only partially accurate.

Figure 6.0-8. Correlation Surface of Various Image Features

Although the intensity-based area correlation has inherent
weaknesses, it is still one of the better feature-matching techniques.
When intensity area correlation fails due to poor feature
representation, other techniques such as peak intensity matching, feature
vector-based matching, and segmentation centroiding also fail.

A viable solution to poor correlation is to switch between
multiframe processing and independent-frame processing, based on the
success of the correlators. Most correlation problems occur with
long-range poor-contrast images. As the sensor closes on the scene
or as contrast conditions improve, multiframe processing can be
instituted.
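
One way to express this switching rule is sketched below. The measure of
correlator success (the fraction of feature points whose correlation peak
reaches a threshold) and both threshold values are assumptions made for
illustration only; they are not taken from the contract software.

```python
def choose_processing_mode(peak_scores, min_peak=0.8, min_fraction=0.5):
    """Select the processing mode from the correlation peak values.

    peak_scores:  best correlation value found for each tracked feature.
    min_peak:     assumed value a peak must reach to count as a success.
    min_fraction: assumed fraction of successful features needed before
                  multiframe processing is instituted.
    """
    if not peak_scores:
        return "independent-frame"
    good = sum(1 for s in peak_scores if s >= min_peak)
    if good / len(peak_scores) >= min_fraction:
        return "multiframe"
    return "independent-frame"
```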

4 Optical flow noise filter

The affine transform is used to identify and to remove feature
points that do not accurately represent the frame to frame
positional changes of scene context. These feature points are
unreliable for use in multiframe integration processing. Discrimination
between valid and invalid feature points is accomplished by
building a model of the scene motion and by comparing the history of
each feature with the model. To initially create a reliable model, a
sufficient number of valid points must exist.

We were unable to evaluate the effectiveness of the affine
transform as a noise filter because none of the three data sets
generated enough valid feature points for the transform to create a
model of the scene motion.
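
The idea of the noise filter can be sketched as a least-squares affine fit
followed by rejection of the worst-fitting flow vectors. This is an
illustrative assumption about how such a filter could work, not the
implemented method; the function names and the tolerance tol are
hypothetical.

```python
def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for i in range(3):
        p = max(range(i, 3), key=lambda k: abs(a[k][i]))
        if abs(a[p][i]) < 1e-12:
            raise ValueError("too few independent points for an affine model")
        a[i], a[p] = a[p], a[i]
        for k in range(i + 1, 3):
            f = a[k][i] / a[i][i]
            for j in range(i, 4):
                a[k][j] -= f * a[i][j]
    x = [0.0] * 3
    for i in (2, 1, 0):
        x[i] = (a[i][3] - sum(a[i][j] * x[j] for j in range(i + 1, 3))) / a[i][i]
    return x


def fit_affine(src, dst):
    """Least-squares fit of x' = a*x + b*y + c and y' = d*x + e*y + f."""
    rows = [(x, y, 1.0) for x, y in src]
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
              for i in range(3)]
    coeffs = []
    for axis in (0, 1):
        rhs = [sum(r[i] * d[axis] for r, d in zip(rows, dst)) for i in range(3)]
        coeffs.append(solve3(normal, rhs))
    return coeffs  # [[a, b, c], [d, e, f]]


def residual(model, p, q):
    """Distance between the model prediction for p and the observed q."""
    (a, b, c), (d, e, f) = model
    px = a * p[0] + b * p[1] + c
    py = d * p[0] + e * p[1] + f
    return ((px - q[0]) ** 2 + (py - q[1]) ** 2) ** 0.5


def filter_flow(src, dst, tol):
    """Return indices of flow vectors consistent with one affine motion."""
    keep = list(range(len(src)))
    while len(keep) > 3:
        model = fit_affine([src[i] for i in keep], [dst[i] for i in keep])
        worst = max(keep, key=lambda i: residual(model, src[i], dst[i]))
        if residual(model, src[worst], dst[worst]) <= tol:
            break
        keep.remove(worst)  # drop the least consistent vector and refit
    return keep
```

The sketch makes the stated limitation concrete: an affine model has six
unknowns, so at least three non-collinear valid points are needed, and a
reliable model requires considerably more.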

Assessment  of  the  Multiframe  Integration 

The experiments conducted on the three test sets were used to assess
the performance of the data smoothing techniques used for multiframe
integration. The primary filter used for multiframe data smoothing for the
three test data sets was the 1x1xn median. The 1x1xn mean and 1x1xn
mode-median filters did not produce results different enough to
warrant continued testing. All of the tests consisted of running the rule
directed segmenter on the raw data, multiframe smoothed data (1x1xn median),
and conventionally filtered data (3x3x1 median, independent frame filter)
generated for each test data set. We conducted a comparative study of the
rule directed segmenter performance results and the behavior of a set of
features computed on the data types.

The experiments conducted on data set 1 provided the best overall
results. A comparison of the rule directed segmenter applied to the three
data types for data set 1 accentuated the primary strengths of multiframe
smoothing. Both conventional and multiframe filtering improved
segmentation results over those of the raw data. The primary difference was
in the behavior of the features computed on the three data types. The
features computed on the raw data and conventionally filtered data contained
random fluctuations and wide distributions, which are typical for FLIR data.
The features computed on the multiframe smoothed data were better clustered
and showed increased signal qualities. The improved feature organization
and higher response are an indication of the increase in data stability and
noise reduction. These properties have two important consequences. First,
the improved signal quality greatly reduces the need for special purpose
processing by each ATR component to overcome image ambiguities found in the
raw data. Second, features that represent higher levels of structural
detail usually masked by noise can be computed for improved object
discrimination and classification performance.

The experiments conducted on the other data sets had similar results.
Both of the test data sets used during these experiments consisted of low
contrast images void of any significant context. These conditions made it
necessary to manually derive the scene motion information needed for frame
to frame registration of the detected vehicles. The scene motion
information was estimated to be 73 percent reliable. Nevertheless, the
overall results of the data smoothing process were positive. Improvements
in the structural characteristics of the vehicles were evident from an
examination of the features computed on the three data types. These
results for low contrast images characterized by scene motion information
demonstrate that data smoothing is successful under less than ideal
conditions.


Multiframe  Integration  Using  Edge  Maps 

The initial research using multiframe smoothing consisted of
registering and integrating subimages of raw data placed about a detected
vehicle. All subsequent processing was performed on the smoothed
subimages. Additional research was conducted on integrating edge maps
generated by applying the composite edge operator to the raw images. The
edge maps were handled in the same manner as the raw data images. The
process consisted of registering and integrating subimages of edge map
information placed about a detected vehicle. Test data set 3, which
contained one jeep, one APC, and one truck, all at long range (over 6
kilometers), was used for the experiments. The basic idea was to use
multiframe data smoothing to stabilize the edge map operated on by the
segmentation algorithm. The edge map integration process consisted of
registering a set of five edge magnitude subimages (Figure 6.0-9) and
applying the 1x1x5 median. The direction associated with the selected edge
magnitude was retained as the direction for the edge point.
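
The edge map integration step can be sketched as below, assuming the edge
magnitude and direction subimages are already registered and stored as
nested lists. The function name integrate_edges is hypothetical. For each
pixel the median magnitude across the stack is selected, and the direction
is copied from the frame that supplied that magnitude.

```python
def integrate_edges(mags, dirs):
    """Median-integrate registered edge maps, keeping matching directions.

    mags, dirs: lists of equally sized 2D lists, one pair per frame; the
                number of frames is assumed to be odd (for example, 5).
    Returns (magnitude image, direction image).
    """
    depth = len(mags)
    if depth % 2 == 0:
        raise ValueError("an odd number of frames is assumed")
    rows, cols = len(mags[0]), len(mags[0][0])
    out_mag = [[0] * cols for _ in range(rows)]
    out_dir = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Rank the frames by edge magnitude at this pixel.
            ranked = sorted(range(depth), key=lambda k: mags[k][r][c])
            pick = ranked[depth // 2]  # frame holding the median magnitude
            out_mag[r][c] = mags[pick][r][c]
            out_dir[r][c] = dirs[pick][r][c]
    return out_mag, out_dir
```

Carrying the direction along with the selected magnitude avoids taking a
median of angles, which would be ill defined near the wraparound point.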

A comparison of the raw and smoothed edge images (Figure 6.0-10)
shows the benefit of multiframe edge map integration. The properties of
stability and improved organization depicted in the smoothed edge images
are consistent with those seen in the raw-data smoothing results. The
implication from these experiments is that data smoothing is an effective
data enhancement function when used as a pre-processor or as an embedded
algorithm function.
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Figure  6.0-9.  Multiframe  Edge  Smoothing  (Jeep  at  6000  Meters) 
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Figure  6.0-10.  Comparison  of  Independent  and  Multiframe 
Edge  Smoothing 
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